State-of-the-art (SotA) language models are often accurate on many question-answering benchmarks with well-defined questions. Yet, in real settings, questions are often unanswerable without asking the user for clarifying information. We show that current SotA models often do not ask the user for clarification when presented with imprecise questions and instead provide incorrect answers or "hallucinate". To address this, we introduce CLAM, a framework that first uses the model to detect ambiguous questions and, if an ambiguous question is detected, prompts the model to ask the user for clarification. Furthermore, we show how to construct a scalable and cost-effective automatic evaluation protocol using an oracle language model with privileged information to provide clarifying information. We show that our method achieves a 20.15 percentage point accuracy improvement over SotA on a novel ambiguous question-answering data set derived from TriviaQA.
translated by Google Translate
The Elo algorithm, due to its simplicity, is widely used for rating in sports competitions as well as in other applications where the rating/ranking is a useful tool for predicting future results. However, despite its widespread use, a detailed understanding of the convergence properties of the Elo algorithm is still lacking. Aiming to fill this gap, this paper presents a comprehensive (stochastic) analysis of the Elo algorithm, considering round-robin (one-on-one) competitions. Specifically, analytical expressions are derived characterizing the behavior/evolution of the skills and of important performance metrics. Then, taking into account the relationship between the behavior of the algorithm and the step-size value, which is a hyperparameter that can be controlled, some design guidelines as well as discussions about the performance of the algorithm are provided. To illustrate the applicability of the theoretical findings, experimental results are shown, corroborating the very good match between analytical predictions and those obtained from the algorithm using real-world data (from the Italian SuperLega, Volleyball League).
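As a rough illustration of the update whose step size the abstract discusses, here is a minimal sketch of the conventional Elo rule. The logistic expected-score form with base 10 and scale 400, the default `step_size` of 32, and the function names are assumptions taken from common practice, not details given in the abstract.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Predicted score of player A against player B (conventional logistic form)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a, rating_b, score_a, step_size=32.0, scale=400.0):
    """Return updated ratings after one game.

    score_a is 1.0 for a win of A, 0.5 for a draw, 0.0 for a loss.
    step_size is the hyperparameter (often called K) that trades off
    convergence speed against rating noise.
    """
    delta = step_size * (score_a - expected_score(rating_a, rating_b, scale))
    # The update is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two equally rated players, A wins: A gains half the step size.
print(elo_update(1500.0, 1500.0, 1.0))  # (1516.0, 1484.0)
```

A larger `step_size` lets ratings react faster to results but makes them fluctuate more once they are near their stationary values, which is the trade-off the design guidelines in the abstract refer to.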
Compartmental models are a tool commonly used in epidemiology for the mathematical modelling of the spread of infectious diseases, with their most popular representative being the Susceptible-Infected-Removed (SIR) model and its derivatives. However, current SIR models are limited in their ability to model government policies in the form of non-pharmaceutical interventions (NPIs) and weather effects, and they offer limited predictive power. More capable alternatives such as agent-based models (ABMs) are computationally expensive and require specialized hardware. We introduce a neural-network-augmented SIR model that can be run on commodity hardware, takes NPIs and weather effects into account, and offers improved predictive power as well as counterfactual analysis capabilities. We demonstrate our model's improvement over the state of the art in modeling COVID-19 in Austria during the 03.2020 to 03.2021 period and provide an outlook for the future up to 01.2024.
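For orientation, the plain SIR baseline that the abstract builds on can be sketched as follows. This is only the classical model with constant rates, integrated with a simple forward-Euler scheme; it is not the neural-network-augmented model, and the function name, parameter values, and step size are illustrative assumptions.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the classical SIR equations with forward Euler.

    beta:  transmission rate per day (assumed constant here)
    gamma: removal rate per day (assumed constant here)
    Returns a list of (S, I, R) tuples, one per day, including day 0.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population, conserved by the model
    steps_per_day = int(round(1.0 / dt))
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        history.append((s, i, r))
    return history


# Hypothetical parameters: basic reproduction number beta / gamma = 3.
trajectory = simulate_sir(beta=0.3, gamma=0.1, s0=9990, i0=10, r0=0, days=100)
peak_day = max(range(len(trajectory)), key=lambda d: trajectory[d][1])
print("peak of infections on day", peak_day)
```

In an augmented variant, one would expect a quantity such as `beta` to vary over time as a function of NPIs and weather rather than stay constant; how the paper does this is not specified in the abstract.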
Convolutional neural networks (CNNs) define the state-of-the-art solution on many perceptual tasks. However, current CNN approaches largely remain vulnerable to adversarial perturbations of the input that have been crafted specifically to fool the system while being quasi-imperceptible to the human eye. In recent years, various approaches have been proposed to defend CNNs against such attacks, for example by model hardening or by adding explicit defence mechanisms. In the latter case, a small "detector" is included in the network and trained on the binary classification task of distinguishing genuine data from data containing adversarial perturbations. In this work, we propose a simple and light-weight detector, which leverages recent findings on the relation between networks' local intrinsic dimensionality (LID) and adversarial attacks. Based on a re-interpretation of the LID measure and several simple adaptations, we surpass the state-of-the-art on adversarial detection by a significant margin and reach almost perfect results in terms of F1-score for several networks and datasets. Sources available at: https://github.com/adverML/multiLID
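To make the LID measure concrete, the sketch below implements the standard maximum-likelihood LID estimator from nearest-neighbor distances. It illustrates only the classical estimate; the re-interpretation and adaptations proposed in the paper are not described in the abstract and are not reproduced here. The function name and the synthetic test distances are assumptions.

```python
import math


def lid_mle(distances):
    """Maximum-likelihood estimate of local intrinsic dimensionality.

    distances: positive distances from a query point to its k nearest
    neighbors (in any order). Uses
        LID = -1 / mean(log(r_i / r_max)).
    """
    if not distances or min(distances) <= 0.0:
        raise ValueError("need at least one strictly positive distance")
    r_max = max(distances)
    mean_log = sum(math.log(r / r_max) for r in distances) / len(distances)
    if mean_log == 0.0:
        raise ValueError("all distances are equal; the estimate is undefined")
    return -1.0 / mean_log


# Synthetic check: for points spread uniformly in m dimensions, the i-th
# of k neighbor distances scales roughly like (i / k) ** (1 / m).
k = 100
for m in (2, 3):
    dists = [(i / k) ** (1.0 / m) for i in range(1, k + 1)]
    print(m, round(lid_mle(dists), 2))
```

A detector in this family would typically compute such estimates on activations of several network layers and feed them to a small binary classifier; that pipeline is stated here as a general pattern, not as the authors' exact design.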
We propose a novel approach for deep learning-based Multi-View Stereo (MVS). For each pixel in the reference image, our method leverages a deep architecture to search for the corresponding point in the source image directly along the corresponding epipolar line. We denote our method DELS-MVS: Deep Epipolar Line Search Multi-View Stereo. Previous works in deep MVS select a range of interest within the depth space, discretize it, and sample the epipolar line according to the resulting depth values: this can result in an uneven scanning of the epipolar line, hence of the image space. Instead, our method works directly on the epipolar line: this guarantees an even scanning of the image space and avoids both the need to select a depth range of interest, which is often not known a priori and can vary dramatically from scene to scene, and the need for a suitable discretization of the depth space. In fact, our search is iterative, which avoids the building of a cost volume, costly both to store and to process. Finally, our method performs a robust geometry-aware fusion of the estimated depth maps, leveraging a confidence predicted alongside each depth. We test DELS-MVS on the ETH3D, Tanks and Temples and DTU benchmarks and achieve competitive results with respect to state-of-the-art approaches.
For conceptual design, engineers rely on conventional iterative (often manual) techniques. Emerging parametric models facilitate design space exploration based on quantifiable performance metrics, yet remain time-consuming and computationally expensive. Pure optimisation methods, however, ignore qualitative aspects (e.g. aesthetics or construction methods). This paper provides a performance-driven design exploration framework to augment the human designer through a Conditional Variational Autoencoder (CVAE), which serves as a forward performance predictor for given design features as well as an inverse design feature predictor conditioned on a set of performance requests. The CVAE is trained on 18'000 synthetically generated instances of a pedestrian bridge in Switzerland. Sensitivity analysis is employed for explainability and to inform designers about (i) relations of the model between features and/or performances and (ii) structural improvements under user-defined objectives. A case study proved our framework's potential to serve as a future co-pilot for conceptual design studies of pedestrian bridges and beyond.
This paper presents a proof-of-concept method for classifying chemical compounds directly from NMR data without doing structure elucidation. This can help to reduce time in finding good structure candidates, as in most cases matching must be done by a human engineer, or at the very least a process for matching must be meaningfully interpreted by one. Therefore, for a long time automation in the area of NMR has been actively sought. The method identified as suitable for the classification is a convolutional neural network (CNN). Other methods, including clustering and image registration, have not been found suitable for the task in a comparative analysis. The result shows that deep learning can offer solutions to automation problems in cheminformatics.
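In this paper, we consider the problem of learning variational models in the context of supervised learning via risk minimization. Our goal is to provide a deeper understanding of two approaches to learning variational models: bilevel optimization and algorithm unrolling. The former treats the variational model as a lower-level optimization problem beneath the risk minimization problem, while the latter replaces the lower-level optimization problem with an algorithm that solves said problem. Both approaches are used in practice, but unrolling is much simpler from a computational point of view. To analyze and compare the two approaches, we consider a simple toy model and compute all risks and the respective estimators explicitly. We show that unrolling can be better than the bilevel optimization approach, but also that the performance of unrolling can depend significantly on further parameters, sometimes in unexpected ways: while the step size of the unrolled algorithm matters a lot, the number of unrolled iterations matters only in whether it is even or odd, and these two cases differ notably.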
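Conformance checking is a process mining technique that allows verifying the conformance of process instances to a given model. This technique is therefore predestined for use in the medical context, to compare treatment cases with clinical guidelines. However, medical processes are highly variable, highly dynamic, and complex. This makes it difficult to use imperative conformance checking approaches in the medical domain. Studies show that declarative approaches can better address these characteristics. However, none of these approaches has yet gained practical acceptance. Another challenge is alignments, which usually do not add any value from a medical point of view. For this reason, we investigate in a case study the usability of the HL7 standard Arden Syntax for declarative, rule-based conformance checking and the use of manually modeled alignments. Using this approach, it was possible to check the conformance of treatment cases and to create medically meaningful alignments for large parts of the medical guideline.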
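We propose a method for synthesizing cardiac MR images with plausible heart shapes and realistic appearances, with the aim of generating labeled data for deep learning (DL) training. It breaks image synthesis down into label deformation and label-to-image translation tasks. The former is achieved via latent space interpolation in a VAE model, while the latter is accomplished via a conditional GAN model. We devise an approach for label manipulation in the latent space of the trained VAE model, namely pathology synthesis, which aims to synthesize a series of pseudo-pathological synthetic subjects with the characteristics of a desired heart disease. Furthermore, we propose to model the relationship between 2D slices by estimating the correlation coefficient matrix between the latent vectors and using it to correlate the elements of randomly drawn samples before decoding to image space. This simple yet effective approach results in 3D-consistent subjects generated from 2D slices. Such an approach could provide a solution for diversifying and enriching the available database of cardiac MR images and pave the way for the development of DL-based image analysis algorithms. The code will be available at https://github.com/sinaamirrajab/cardiacpathologysynthesis.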
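The step of correlating randomly drawn samples across slices can be illustrated with a generic technique: multiplying independent Gaussian draws by the Cholesky factor of a target correlation matrix. Whether the authors use exactly this construction is not stated in the abstract, so the sketch below is an assumption, and the function names and the 2x2 example matrix are hypothetical.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular factor L of a symmetric positive-definite matrix (L L^T = matrix)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - acc)
            else:
                lower[i][j] = (matrix[i][j] - acc) / lower[j][j]
    return lower


def correlate(samples, corr):
    """Map independent draws (one per slice) to draws with correlation matrix corr."""
    lower = cholesky(corr)
    n = len(samples)
    return [sum(lower[i][k] * samples[k] for k in range(i + 1)) for i in range(n)]


# Hypothetical slice-to-slice correlation for one latent element of two slices.
corr = [[1.0, 0.8], [0.8, 1.0]]
independent = [random.gauss(0.0, 1.0) for _ in range(2)]
print(correlate(independent, corr))
```

In this reading, the transform would be applied per latent dimension across the slices of one subject, so that neighboring slices decode to anatomically similar shapes.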